Skip to content

Spherical RoPE#2082

Open
csjfwang wants to merge 73 commits into
ecmwf:developfrom
csjfwang:spherical-rope
Open

Spherical RoPE#2082
csjfwang wants to merge 73 commits into
ecmwf:developfrom
csjfwang:spherical-rope

Conversation

@csjfwang
Copy link
Copy Markdown
Contributor

Description

1st draft version of spherical rope.

Issue Number

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hegdedoc in the github issue with all the configurations and runs for this experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

wang85 and others added 30 commits July 16, 2025 10:07
@csjfwang csjfwang marked this pull request as ready for review April 25, 2026 20:17
@csjfwang csjfwang changed the title [Draft] Spherical rope Spherical RoPE Apr 25, 2026
@sophie-xhonneux
Copy link
Copy Markdown
Contributor

@csjfwang Can you please post the relevant hedgedoc link here? So it is easy to find later

@csjfwang
Copy link
Copy Markdown
Contributor Author

csjfwang commented May 5, 2026

@csjfwang Can you please post the relevant hedgedoc link here? So it is easy to find later

Yes! Here:
https://gitlab.jsc.fz-juelich.de/hedgedoc/jTvG95XHRmmIyfTen-beEg?view

Copy link
Copy Markdown
Contributor

@sophie-xhonneux sophie-xhonneux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please allow for backwards compatability the rope_2D option to set rope_mode to 2d. Add a warning to the logger that this will be deprecated.

Also can you please add the performance cost in iterations/second to the hedgedoc or this PR

@sophie-xhonneux
Copy link
Copy Markdown
Contributor

@csjfwang It would be great to get this PR over the line asap!

@csjfwang
Copy link
Copy Markdown
Contributor Author

@csjfwang It would be great to get this PR over the line asap!

Thank you @sophie-xhonneux !
I am looking into the your comments, will solve them today!

@csjfwang
Copy link
Copy Markdown
Contributor Author

csjfwang commented May 11, 2026

Please allow for backwards compatability the rope_2D option to set rope_mode to 2d. Add a warning to the logger that this will be deprecated.

Also can you please add the performance cost in iterations/second to the hedgedoc or this PR

Here are some performance cost:
https://gitlab.jsc.fz-juelich.de/hedgedoc/jTvG95XHRmmIyfTen-beEg?view#Performance-cost---samplessecond-on-1-GPU

I've solved the above two comments, could you please help review this PR again?

healpix_level: 5

rope_2D: False
# Generalized RoPE selector.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make more sense to have a section positional_encoding in the future

Comment thread config/config_forecasting_eerie.yml Outdated
healpix_level: 5

rope_2D: False
# Generalized RoPE selector.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not add this to all configs. It should default to the case that it's not used (for the moment).

return apply_rotary_pos_emb(q, k, cos, sin, unsqueeze_dim=unsqueeze_dim)


####################################################################################################
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid these non-standard comments. Use a separate file if you think it should be separated better

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @clessig !

I've solved the above issues, could you please have a look again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

model:pretrain model Related to model training or definition (not generic infra)

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants